The value of PROMs for predicting erectile dysfunction in prostate cancer patients with Bayesian network

Highlights • PROMs are more predictive than clinical information for endpoints like erectile dysfunction.• Patient perspective should be incorporated in outcome research, such as erectile dysfunction, since the patient understands the disease symptoms better.• Experts' involvement in clinical research is critical as it aligns the modeling steps with the clinical process for increased clinical plausibility.


Introduction
Erectile dysfunction, also called impotence, is the inability to develop or sustain an erection satisfactory for sexual intercourse [1].It is one of the most common types of sexual dysfunction and affects approximately 50 % of men above the age of 40 years and increases to about 70 % for men 70 years and older.Erectile dysfunction is primarily linked with prostate cancer and the psychiatric disorders caused by disease diagnosis [2,3].However, a large majority of erectile dysfunction is caused by the numerous treatment options of prostate cancer such as surgery (prostatectomy), radiation, and hormonal therapy because the nerves and blood vessels that control the physical aspect of an erection are incredibly delicate and easily affected by any change or trauma to the area [2].
Whether or not the nerves were spared during surgery or whether the most precise dose planning was used during radiation therapy, erectile dysfunction remains the most common side effect of prostate cancer treatment and the predominant concern for a significant majority of patients with localized prostate cancer, since it negatively impacts their quality of life [3].Therefore, knowing if a prostate cancer patient will have erectile dysfunction after treatment is pivotal for shared decisionmaking at the same time, choosing the optimal treatment can be challenging since each option comes with different side effects, erectile dysfunction included [4].
In recent years, caregivers have been making use of patient-reported outcome measures (PROMs) to evaluate patients' health and well-being during the treatment process [5,6].These PROMs capture patients' information such as symptoms, performance status, and quality of life which is based on the patient's perspective of their health at intervals [7].Caregivers can then leverage this information to improve shared decision-making and better patient-centered care [8].However, the amount of information available is more than humans can process, mostly because humans' ability to handle several cognitive entities at a time is very limited [9].This calls for support from predictive models that can better combine several important patient information to probabilistically determine the risk of them developing erectile dysfunction after treatment.
Verma et al. [10] have demonstrated that various models can be successfully applied to PROM information to predict short-and longterm outcomes to guide shared decision-making processes between clinicians and patients for better cancer care.In addition, Rivera et al. [11] have also stressed the importance of integrating PROM in healthcare outcome research studies since it allows for the incorporation of the patient's view of the disease symptoms, hence reflecting a more holistic picture of the patient's health and well-being for optimal decision.Our previous findings demonstrate that logistic regression can nicely combine PROM and clinical information to identify patients liable for developing erectile dysfunction [12].However, only two PROMs were included in the model, which does not provide the full predictive potential of PROM information.Moreover, no study has leveraged the power of algorithms such as Bayesian networks on PROM information, which can intricately represent the causal relations between the available variables and efficiently make probabilistic inferences about any variable of interest in the network with optimal performance [13,14].In addition, they have increased clinical plausibility if specified by experts in the field and are more performant when inferred from the available data (evidence from data) by a learning algorithm [13].
Therefore, this study aims to develop and externally validate a performant and clinically plausible Bayesian network structure to predict one-year erectile dysfunction in prostate cancer patients by complementing expert knowledge with evidence from data (expert-modified structure).
We hypothesize that though a Bayesian network can offer a more holistic view of a prostate cancer patient's treatment journey because it can incorporate more patient information and the interplay between these variables, a carefully developed Bayesian network structure using PROM information will outperform a structure developed from clinical information.

Materials and methods
This study analyses a subset of the ProZIB dataset collected by the Netherlands Comprehensive Cancer Organization (Integraal Kankercentrum Nederland; IKNL) from 65 Dutch hospitals.More information about the complete ProZIB dataset is described in the article by Vernooij et al. [15].This subset contained clinical information of 964 localized prostate cancer patients with their PROMs information based on the EPIC26 questionnaire (Supplementary materials) at diagnosis, 1 and 2 years after diagnosis.Only patients with localized or locally advanced prostate cancer were included in this study.Patients with regional lymph node involvement or distant metastasis (cT1-3, N0, M0) were excluded.Also, we excluded all patients with missing information and variables with more than 25 % missingness.All the patients in this study provided written informed consent to participate.
Patient information from each hospital was randomly assigned to the training or test data to develop the Bayesian network structures and test their performance, respectively.Tables 1 and 2 show the PROMS and clinical variables under consideration.Erectile dysfunction at one year was considered the endpoint of interest, dichotomized in patients with no sexual problems and patients with any sexual problems as shown in equation 1.The Sankey diagram (Fig. S1 supplementary materials) shows the flow changes in patients' frequency of erections at baseline and 1-year.

Bayesian network
Bayesian networks are graphical models called directed acyclic graphs (DAG) where the available variables of a system, domain, or organ under consideration are represented as nodes, and directed arrows represent probabilistic dependencies or causal relationships between the variables in a graphical [16,17].In a DAG, if an arrow or arc moves from node X to node Y, then node X has an influence on node Y. Hence, Nodes X is considered to be the parent of node Y, and node Y is said to be the child of nodes X.A variable can be a parent or child to several variables but can not act as a parent and child at the same time to another variable.The quantitative connection between parents and children in a DAG is provided by the conditional probability table (CPT), which shows the probabilistic state of a node based on the states of its parent node(s) [16,18].

Statistical analysis
Exploratory analyses and data visualization were applied to B. Osong et al. understand the data and detect any underlying patterns necessary to construct the Bayesian network, such as missing information and possible outlying values.All continuous variables were categorized either based on experts' opinions or from literature.All patients with  missing information and variables with missingness above 25 % were excluded from the analysis.From the 65 Dutch hospitals, we randomly reserved data from.45 hospitals (≈ 70 %) to train the CPT of the resulting Bayesian network structures and 20 hospitals (≈ 30 %) to test their performance.The predictive performance of the Bayesian network structures was assessed by computing the area under the curve (AUC) and generating calibration plots.All analyses were conducted in R version 4.2.1 [19] with the bnlearn package [20,21] to develop the Bayesian network structures and the Graphical Network Interface (GeNIe) application visualization [22].

Structure learning
Expert prostate cancer treating physicians specified the relationship between the variables in the Bayesian network based on the PROM and clinical information separately.Separate structures were also inferred from form each of the data sources using the hill-climbing (HC) structure learning algorithm and Bayesian Information Criterion for the goodnessof-fit measure.These algorithmic structures and literature searches were used to complement the expert-specified structures by either adding, removing, or reversing arcs in the expert structure (expert-modified structure).Also, variables without a link to the outcome based on the algorithmic structure were removed from the expert structure.The final structure is a combination of the expert-modified structures based on the algorithmic structures from both data sources.The performance of the resulting Bayesian network structures was evaluated by computing the area under the curve (AUC) and generating calibration plots.

Structure comparison
To test our hypothesis, we compared the performance of the expertmodified Bayesian network structures in predicting erectile dysfunction at one year for prostate cancer patients.To make the comparison fair, we used the same learning algorithm and goodness-of-fit measure to develop the algorithmic structure for both data sources, which are used to complement the expert structures.The final expert complimented structures were first compared structural appearance and then based on performance measures such as the AUC, sensitivity, and specificity values.Calibration plots were generated to compare which structure had its predicted probabilities closer to the observed frequency in data.

Results
Based on the data from the different Dutch hospitals, a total of 946 prostate cancer patients were considered for this study and grouped as follows.Data from the randomly selected 45 hospitals resulted in a total of 662 patients, which were assigned to train the structures, and the rest 20 hospitals with 284 patients to test the performance.From the train and test data, 157 and 68 patients with missing information were excluded from further analysis, reducing the cohort to 505 and 216, respectively.There was no statistical difference between patients in the training and test cohorts for all the considered variables.The BMI and volume of prostate via MRI variables were excluded from further analysis due to their high percentage of missing information.
The median age of patients in this study was 69 (45---87).Most of the patients in this study are nonsmokers or ex-smokers, with more than half of the patient population being alcohol users.Most of the patients in this study either received surgery or no therapy (watch and wait) with only a handful administered hormonal therapy.In general, most of the patients seem to be in "good health" since a majority of them had no CCI, diabetes, depression, or sexual problems.However, about 50 % of the patients had a cardiac problem.The outcome of interest was unbalanced, with 73 % and 75 % of the patients experiencing erectile dysfunction in the train and test data, respectively.The median prostatespecific antigen was 7.9, with a standard deviation of 13 in the training dataset.For BMI, the median value was 26.4 with a standard deviation of 2.7 which falls in the overweight category.The median value of the volume of prostate, on the other hand, was almost twice (48) the normal prostate volume on scan with a standard deviation of 24.8.Tables 1 and  2 show a detailed description of patients' characteristics for this study cohort.
There was a significant mean difference (p-value < 0.05) between patients with and without erectile dysfunction for age and prostatespecific antigen variable (Fig. S3 supplementary materials).The age variable was categorized into two groups with a cutoff value of 70 years.The prostate-specific antigen (PSA) was discretized into three groups based on experts' suggestions with cutoff values at 4 ng/ml and 10 ng/ ml (Fig. S1 supplementary materials).The number of patients without erectile dysfunction was higher in the"good sex life" groups (Patients who responded either, good, no problem, firm enough, or whenever to the questionnaire items) compared to patients who responded otherwise (Fig. S4 supplementary materials).However, the number of patients in the"good sex life" groups with erectile dysfunction was equally very high.A similar trend is also observed with the clinical variables as most of the patients without erectile dysfunction have a relatively better health condition or lifestyle except for the alcohol use variable, where most of the patients were in the drinking group and cardiovascular disease, which had almost the same number of patients (Fig. S5 supplementary materials).Based on treatment, most of the patients without erectile dysfunction did not receive any active therapy.
The experts specified clinical structure had all but four (PSA, Tstage, Nstage, and Gleason) variables being direct parents to the outcome.The algorithmic structure, on the other hand, the algorithmic DAG had two distinct structures, with only therapy being a parent to the outcome.Smoking status, alcohol use, diabetic status, CCI, and cardiovascular disease had no link to the outcome and were removed from the clinical expert-modified structure (Fig. S6 supplementary materials).The two structures were then compared to build the expert-modified structure by either removing (Therapy to Hormone therapy), maintaining (Gleason to therapy and therapy to outcome), reversing (PSA to hormone therapy and Age to Therapy), or adding (Tstage to Nstage and Age to PSA) certain arcs.The same procedure was followed to establish the PROM expert-modified (Fig. S7 supplementary materials) structure.

Structure comparison
Fig. 1 shows the resulting expert-modified Bayesian network structures.The clinical structure has 8 variables and 14 arcs, with the outcome having three parents.The PROM structure, on the other hand, has one variable less than the clinical structure with 10 arcs and three to the outcome.In the clinical structure, three variables (Therapy, Hormone therapy, and PSA) had 3 or more parents.In the PROM structure, only the outcome was a child to 3 variables.In both structures, Tstage and Ability to have erection were the only variables to parent more than 2 variables.Ability to reach orgasm was the only variable without a child besides the outcome.Ability to have erection was the most influential variable in the PROM structure, especially to Quality of erections and Frequency of erection.Quality of erections and Ability to have erection both highly influenced Ability to function sexually.On the other hand, all parents of therapy had a high influence, while only Gleason score and PSA had a high influence on Hormone therapy.Tstage had more influence on Nstage than on Hormone therapy.Table 3 shows the numerical comparison of the expert-modified Bayesian network structures based on their performance in terms of the AUC (Fig. S8 supplemental material), sensitivity, and specificity values for the different information sources in the training and test data.The PROM structure performed better than the expert structure for all matrices of interest, especially in the training data.However, both structures had the same accuracy and specificity value in the test data.
Fig. 2 shows the calibration plots of the expert-modified structures.Both structures seem to be well-calibrated in the train data.In the test data, the clinical structure looks to be better calibrated than the PROM structures since its predicted probabilities are more closely matched to the observed data frequencies.However, the clinical structure could not evaluate patients with higher probabilities but the PROM structure which could have a wider confidence interval.
The final structure after combining the PROM and clinical expertmodified structures is shown in Fig. 3 with the addition of two arcs (Age to Ability to function sexually and sexual functioning).Therefore, the final structure comprises 14 variables and 26 arcs with six to the outcome.Only age was a parent to four variables.
Fig. 4 shows the performance of the final structure in terms of AUC and calibration in the train and test data.The structure achieved an AUC value of 0.94 (0.92 -0.96) and 0.84 (0.77---91) for predicting the risk of erectile dysfunction at 1 year on the train and test data.The sensitivity and specificity values of the structure were 0.89 and 0.84, respectively, with an accuracy of 0.85 on the train data.In the test data, the sensitivity was 0.8 and the specificity 0.74, with an accuracy of 0.75.The calibration plots in Fig. 4 show the structure is well calibrated on the train data since predicted probabilities and the observed frequencies coincide almost at all points.On the test data, the structure seems to be poorly calibrated, given how distant the points are from the diagonal dotted gray line, which is the reference line for a perfectly calibrated structure.

Discussion
In the current study, we have developed and validated a Bayesian network structure that uses both clinical and PROM information to predict erectile dysfunction for prostate cancer patients.The structure was developed to be clinically sane and sparse at the same time by combining knowledge from expert prostate cancer treating physicians and evidence from data.That is, including only variables with a direct or indirect relationship with the outcome of interest based on the algorithmically generated structure.The developed structure used seven clinical features and five PROM information known to have an influence on the sexual life of a patient.The predictive performance of the structure based on the AUC values was above 80 % in both the training and test data.However, the structure was not well-calibrated in the test data as compared to the training data due to the imbalanced nature of the data.The PROM expert-modified structure with one variable and four arcs less was more performant than the clinical expert-modified structure for all considered matrices in both the train and test data.However, both structures had the same accuracy and specificity in the test data.
Deist et al. [23] have shown that the choice of a model should not be based on performance since all models can capture the basic signal within any data with little difference in performance (AUC value) when properly analyzed and compared.These findings have also been recently confirmed by Hasannejadasl et al. [24] and Biecek et al. [25] on a limited number of models.However, these studies did not include Bayesian networks in their model comparison.Although Jayasurya et al. [26] and Ducher et al. [27] have shown that Bayesian networks have a superior predictive performance on SVM and logistic regression to predict IgA nephropathy and two-year survival in lung cancer patients, respectively.Bayesian network structure was used in this study due to the complexity of erectile dysfunction, which can be caused by hormones, emotions, nerves, muscles, blood circulation, stress, mental health, or a combination of these factors.Therefore, graphical models such as Bayesian networks with unlimited restrictions in defining interactions between variables are more suited to model such systems as they can better capture the biological process and complexity within the system.In addition, they can be built by experts who better understand the biological process behind the disease for more biological/clinical plausibility because two variables can be correlated but not necessarily have a causal effect.Also, their ability to make inferences about any variable in the network is very pivotal for this project, as the developed structure can support decision-making as both a diagnostic and prognostic tool.Lastly, no study has investigated the predictive value of Bayesian networks for erectile dysfunction in prostate cancer patients thus far.
The process involved in getting an erection is very complex, which, if disrupted at any step, can lead to erectile dysfunction problems.This complexity makes it challenging to predict if a patient will suffer from erectile dysfunction after treatment.Nonetheless, Hasannejadasl et al. [12] have also looked at predicting erectile dysfunction using logistic regression but a direct comparison with this study will not be fair because of the difference in the study design and analysis process.However, both studies do agree that the Frequency of erections, Quality of erections, Therapy, and Hormone therapy have a direct influence on erectile dysfunction.Gleason group and Tumor Tstage were also included as having a direct relationship with erectile dysfunction, which in this study, are only related via the type of treatment received by the patient.Alcohol use, lack of energy, Cardiovascular disease, and Diabetes were also included in their model, which was not in our final expert-modified structure.This difference in selected variables can be directly attributed to the model type and data preparation steps, given that they considered missing information as a level for some variables, which increases the sample size and the chance of getting a significant

Table 3
The Accuracy, AUC, sensitivity, and specificity values of the expert-modified structure for the clinical and PROM data source on the train and test data.relationship with the outcome.The social variables smoking status and alcohol use were not included in the final network despite known evidence to have an effect on erectile dysfunction.Smoking status is known to significantly increase the risk of erectile dysfunction given that it damages endothelial cells and impairs endothelial vasodilation [28][29][30][31][32].However, their absence in the network can be understood because only about 6 % of the patients in this study are smokers.In addition, erectile function has been shown to greatly improve with smoking cessation, which accounts for about 39 % of the patient population in this study [32].Alcohol consumption, on the other hand, is known to have an effect on erectile dysfunction because it damages blood vessels and nerves [33] and with more than 70 % of the patients in this study being alcohol consumers, its absence in the network is somewhat surprising.However, one might justify its absence by the fact that alcohol only affects erectile function when taken in excess, and this information was not available in the data.On the contrary, it temporarily affects hormone levels and increases sex drive with moderate consumption, which might explain in part the "good sex life" of most of the patients.
Other variables like the Charlson comorbidity index, cardiovascular disease, diabetes, and feeling depressed were also not included in the final structure, which is understandable because more than 70 % of the patients did not suffer from diabetes, feel depressed, or had any comorbidity.Although one might argue that the small percentage of depressed patients is due to antidepressants, this information was not captured in this study, and the relatively "good sexual life" of the patients proves otherwise since antidepressant medication should have a negative impact on users such as sexual dysfunction [34].On the contrary, cardiovascular disease, which had about 50 % of the patients in the diseased group, was also not included in the final structure.Generally, the absence of these variables in the final structure can be attributed to the chosen goodness-of-fit measure for the learning algorithm, as it tends to prioritize sparse structures.However, Martell et al. [35] did not also find these excluded variables important to predict erectile dysfunction in a cohort of 212 patients receiving moderately hypofractionated radiotherapy to the prostate.
Most of the connections between the variables in the algorithmic structures (Supplementary materials Figs.S6 & S7) are either supported in the literature such as smoking status to alcohol use [36], Frequency of erections to erectile dysfunction [12], Therapy to erectile dysfunction [12], and diabetes to smoking status [37] or are clinically sane as Ability to function sexually is connected to Sexual functioning and Charlson Comorbidity Index is only linked to cardiovascular disease and Diabetes.This suggests that the HC algorithm combines well with the BIC scoring criteria as it penalizes structures with more parameters to avoid overfitting leading to a parsimonious structure that captures the signals (correlations) within the data better and hence can be used to complement the experts' structure.However, this parsimony caused a wellestablished variable (feeling depressed) known to affect erection, not to be connected to erectile dysfunction directly or indirectly.Generally, depression is expected to lead to erectile dysfunction because of the complex sex-related chemistry between the brain and the sex organs, which is greatly disrupted in depressed persons [38].
Based on the clinical structure, all the disconnected variables share similar risk factors that involve blood flow obstruction.Normally, these variables should be associated with erectile dysfunction since they all share the same arterial cause [39,40].One might argue that the separation is due to the opted scoring criteria, which penalize complex models more the simple ones.However, these variables were still not connected to erectile dysfunction even with more lenient scoring criteria like the Akaike information criterion (AIC).One plausible explanation for this disconnection could therefore be that an important variable, such as endothelial dysfunction not considered in this study, could be the link between these disconnected variables and erectile dysfunction.Another reason could be that these variables are not directly related to erectile dysfunction but their interplay with other hematological variables not captured in this study increases their influence on erectile dysfunction [39,40].
While it is important to develop accurate and performant models, it is also crucial that these models be sane, trustworthy, and interpretable, especially in a healthcare-related milieu.Therefore, using methods like Bayesian networks that can be developed by domain experts who understand better the biological process involved in the treatment of tumors and the aftermath would ensure that they are clinically sane [13] and their graphical nature facilitates their explanation to end users [41].The objective of this study was therefore attended as the proposed Bayesian network structure, which combines PROMs and clinical information, has a predictive performance well above the chance level with a clinically plausible structure that can be easily understood and interpreted by clinicians given them the freedom to focus on addressing the medical concerns of the patient rather than struggling with comprehension of the structure.Therefore our proposed structure can serve as a decision aid to support personalized treatment decisionmaking in routine clinical settings.Our results also show that PROM information is better than clinical information for predicting erectile dysfunction in prostate cancer patients.This confirms our hypothesis and further supports the need to include patients' perspectives in outcome research.
Our study is the first to assess the predictive value of Bayesian network structures for erectile dysfunction in prostate cancer patients to the best of our knowledge.Therefore, despite the performance of the Bayesian network structures, there is still room for improvement, especially in defining the relationship between the available variables.In addition, the performance was not optimal probability because we used disease-specific PROM information, which could restrict the structure's performance since it will not capture all the vital functional well-being and health status of the patient.Hasannejadasl et al. [12] used lack of energy in their nomogram to predict erectile dysfunction, which shows to influence the outcome greatly.Also, the expert structures were modified with algorithmic structures developed on a collection of retrospective clinical data, which might be biased.Therefore, proper testing and validation of the structures are warranted before clinical applicability can be considered, preferably prospectively, on routine clinical data.
The retrospective nature of this study caused possible important variables such as BMI and volume of prostate via MRI not to be considered because of the high percentage of missing information.Skrypnik et al. [42] have shown that erectile dysfunction has a direct relationship with BMI, and patients with a high BMI are at increased risk of developing erectile dysfunction.The volume of prostate via MRI, on the other hand, is believed to be related to erectile dysfunction because an enlarged prostate causes several urinary problems, which can easily be extended to erectile dysfunction.However, we did not find any study that has investigated or found a direct relationship between an enlarged prostate and erectile dysfunction.Therefore an enlarged prostate by itself may not be the cause of erectile dysfunction, but rather the treatment for the enlarged prostate or other factors associated with it, like age as the prostate continues to grow with age [43].Nonetheless, Collins et al. [43] found that prostate-specific antigen level is greatly influenced by the volume of the prostate.Such a link could have improved the performance of our final Bayesian network structure but was not included due to the retrospective nature of this study.
Given the pivotal role of blood in the process of erection, we believe that some hematological parameters such as Neutrophil-to-lymphocyte ratio, white-blood-cell count, lymphocyte count, and neutrophilic cell count, which are absent in the current study, could further improve the predictive performance of our Bayesian network [44].Lastly, the outcome variable was dichotomized which leads to a loss of information.Future studies should consider using the different levels of erectile dysfunction to help patients and caregivers make optimal treatment decisions.

Conclusion
We have developed and validated clinically plausible Bayesian network structures from routine clinical data, PROMs, and a combination of both data sources for predicting one-year erectile dysfunction in prostate cancer patients.The developed structures maximize performance and clinical plausibility as they complement experts' opinions with evidence from data.Our result showed that PROM information is pivotal and better at predicting erectile dysfunction at one-year for prostate cancer patients, which further accentuates the growing global desideratum of incorporating patients' perspectives in healthcare for quality and effective decision-making.However, combining both sources of information gives rise to a structure with improved discriminating power.In the future, we intend to update the developed Bayesian network structure with hematological parameters and then extend it to a dynamic structure to incorporate PROM information collected at different time points during treatment as this captures a more holistic view of the patient's journey.

Fig. 1 .
Fig. 1.Expert-modified Bayesian network structures using clinical and PROM, respectively.The oval structure represents variables or Node, and arrows show the direction of the causal-effect relationships.The arrow size indicates the magnitude of influence.

Fig. 2 .
Fig. 2. Calibration plots of the expert-modified Bayesian network structures in the training (left) and Test (right) data.The gray line represents ideal calibration, while the solid lines show the structure's performance, and the colored bandwidths indicate their respective 95% confidence interval.

Fig. 3 .Fig. 4 .
Fig. 3. Bayesian network structures using clinical (Blue) and PROM (green) information.The oval structure represents variables or Node, and arrows show the direction of the causal-effect relationships.The arrow size indicates the magnitude of influence.(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Table 1
General PROMs characteristics at baseline on the training and validation datasets with the dichotomized outcome of interest at 1-year.

Table 2
General patient clinical characteristics at baseline on the training and validation datasets with the dichotomized outcome of interest at 1-year.